The Benchmark module aims to evaluate the
performance of Signature Search Methods (SSMs) through
pharmacotranscriptomic datasets and drug annotations. The results could
tell us which method and topN (the number of gene used for SS) are the
best.
In this module, you can evaluate Signature Search Methods (SSMs) based
on signatures and well-annotated drugs in LINCS.
Please follow the steps below to perform job:
1. Select a pharmacotranscriptomic dataset
(PTD).
2. Select at least two SSMs for testing
3. Upload a oncogenic signature (OGS) representing a specific
cancer.
4. Upload at least one type Drug annotation file indicating the
drug’s efficacy in experiments or its approval status in clinic (e.g.,
FDA-approved).
SSP also offer a package of demo files in Info-data page.
Pharmacotranscriptomic dataset (PTD), originally
sourced from LINCS L1000 (GSE92742), is presented in an n × m matrix
where n represents the drug names and m denotes a list of gene symbols
along with their corresponding log2FC values.
The SSP incorporates PTDs encompassing nine tumor cell lines across
various concentrations and treatment durations.
Generally, it is advisable for users to select a PTD rich in drugs and
closely related to the cancer of interest. PTDs obtained at 6 hours with
10μM or at 24 hours with 10μM are commonly recommended.
Utilizing multiple datasets is also recommended for users to enhance the
reliability of the outcomes. Drugs that are consistently ranked highly
across different concentrations and treatment durations within PTDs have
a higher likelihood of being effective.
A oncogenic signature (OGS) is a gene list header with Gene and Log2FC. Notably, SSP accepts the genes in the format of gene symbol and assumes that input OGS are statistically significant (p < 0.05), ensuring their relevance for further analysis.
Should your OGS contain genes formatted with alternative identifiers (such as EntrezID, Ensembl, UniProt, Gene name, etc.), proceed to the Converter page for the necessary conversion.
Notably, the SSP employs two drug annotation metrics, AUC and ES, with users required to select at least one for the performance assessment of SSMs. Annotating every drug is impractical; nonetheless, an increased number of annotations leads to more precise outcomes.
1. AUC: The AUC aims to assess whether the SSM can
effectively distinguish between effective and ineffective drugs, with
a higher score indicating a better method. This metric
is applicable to drug annotations derived from large-scale experimental
screenings, such as determining drug efficacy based on whether
IC50 value < 10μM.
We recommend that the file comprises at least 50 drugs
and clearly denotes both two categories of labels: ‘Effective’
and ‘Ineffective’.).
The file should appear as follows (header with
Compound.name and Group) and upload in
step 4a:
Subsequently, drug scores will be computed and ranked based on the confusion matrix using the Area Under the Curve (AUC), with a higher AUC signifying superior performance.
2. ES: The ES aims to evaluate whether effective drugs can
be well enriched at the top of the descending ordered drug list based on
scores, with a negative score indicating a better
method. This metric is suitable for drug annotations based on
clinical practice (for example, FDA-approved drugs), where a minority of
drugs are known to be effective, and the efficacy of the majority
remains uncertain.
We recommend that the file comprises at least 10
drugs.
The file should appear as follows and upload in step 4b:
SSP will calculate drug scores and determine the enrichment score (ES), with a lower ES indicating enhanced performance.
Additionally, the SSP provides a primary annotated drug list accessible through the Annotation Module.
Ultimately, initiate the process by clicking ‘Run,’ which will generate a job ID (jobid) prefixed with ‘BEN’. Results are typically available within approximately 15~30 minutes, and a ‘Quick Tip’ will be provided to assist with result interpretation. Alternatively, you may close the page and retrieve the job ID in the Job Center for future result retrieval.